Abstract

We present a simple mechanism for generating undirected scale-free networks using
random walkers, where the network growth is determined by choosing parent vertices
by sequential random walks. We show that this mechanism produces scale-free
networks with degree exponent $\gamma = 3$ and clustering coefficients depending on random
walk length. The mechanism can be interpreted in terms of preferential attachment
without explicit knowledge of node degrees.

Since the seminal paper by Watts and Strogatz, much research has focused
on the properties of small-world and scale-free networks […], as these
have been found to resemble many naturally occurring networks, such as social
networks […], scientific collaboration networks […], the WWW and
metabolic networks [10]. Most naturally occurring networks can be characterized
by a high degree of clustering together with small average node-to-node
distance […]. In addition, these networks often display a power-law degree
distribution $p(k) \propto k^{-\gamma}$, where $k$ denotes vertex degree and $\gamma$ the degree
exponent, which is typically observed to lie within the range $\gamma \approx 2$ to $3$. The
emergent power law was originally explained by Barabási and Albert (BA) in
terms of combining network growth and preferential attachment [11]. The BA
preferential attachment model simply states that when a new node is added
to the network, it is preferentially linked to nodes already possessing a large
number of links. This intuitive mechanism results in scale-free networks with
$\gamma = 3$.

Over recent years, several new models for generating scale-free networks
have been proposed [13, …, 12, …]. One motivation has been to
better capture the clustering properties of real-world networks […, 12, …],
as these tend to exhibit a far larger degree of clustering than BA networks.
Furthermore, the original BA model does not explicitly state how the
preferential attachment comes to be. The algorithm utilizes global knowledge of
vertex degrees, but in the case of real-world networks such as social networks
or the WWW, new "vertices" joining these networks rarely have such knowledge.
Nevertheless, power laws emerge. Thus, determining growth-directing
rules which only utilize local information on vertex degrees and connections
is of importance. These ideas have been elaborated in Refs. […], where it
was shown that in directed networks, random walkers moving along the edges
of a network and probabilistically creating new links to their current locations
eventually result in scale-free degree distributions. The idea of reaching
highly connected vertices by following random links was also utilized by Cohen
et al. […] in the context of effective immunization strategies.

Here, we present a simple undirected network growth mechanism based on 
random walks of fixed length, and show that it leads to a scale-free network 
structure with BA degree exponent $\gamma = 3$. Furthermore, the clustering properties
of the network are determined by the random walk length. The algorithm
goes as follows: 

(i) The network is initialized with $m_0$ vertices, connected to each other.

(ii) A random vertex is chosen as the starting point of the random walk.

(iii) At each step of the walk, the walker moves to a randomly chosen neighbor
of the current vertex. After $l$ steps, the vertex at which the random walker has
arrived is marked. The walk is repeatedly continued for $l$ steps until $m \le m_0$
different vertices are marked.

(iv) A new vertex is added to the network and connected to the $m$ marked
vertices by undirected links, and the whole process is repeated starting from
step (ii), until the network has grown to the desired size of $N$ vertices.

Note that in step (iii), the walker is allowed to trace its steps backwards. Thus 
the walk is not self-avoiding, as it would otherwise easily get stuck. If the 
walker arrives at an already marked vertex after $l$ steps, a new $l$-length walk
is started from that vertex. 
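
Steps (i) to (iv) can be sketched in a few lines of Python. This is a minimal illustration rather than the code used for the simulations reported here: the function name `grow_random_walk_network`, its parameters, and the adjacency-list representation are all assumptions. In particular, the seed is taken to be a fully connected clique of at least three vertices, so that it contains a triangle and walks of even length cannot stay trapped on one side of a bipartite seed.

```python
import random


def grow_random_walk_network(n_final, m, walk_length, m0=None, seed=None):
    """Grow an undirected network following steps (i)-(iv).

    Returns a list `adj` where adj[v] is the list of neighbors of vertex v.
    """
    rng = random.Random(seed)
    # Assumption: the seed is a clique of at least 3 vertices, so that it
    # contains a triangle and even-length walks cannot be trapped.
    m0 = max(m, 3) if m0 is None else m0
    if walk_length < 1 or m < 1 or m0 < 3 or m > m0 or n_final < m0:
        raise ValueError("need walk_length >= 1, 1 <= m <= m0, m0 >= 3, n_final >= m0")

    # (i) fully connected seed network of m0 vertices
    adj = [[u for u in range(m0) if u != v] for v in range(m0)]

    while len(adj) < n_final:
        # (ii) uniformly random starting vertex
        current = rng.randrange(len(adj))
        marked = set()
        # (iii) walk l steps, mark the endpoint, and keep walking from there
        # until m distinct vertices are marked
        while len(marked) < m:
            for _ in range(walk_length):
                current = rng.choice(adj[current])
            marked.add(current)  # no effect if the endpoint is already marked
        # (iv) attach the new vertex to the m marked vertices
        new_vertex = len(adj)
        adj.append(sorted(marked))
        for v in marked:
            adj[v].append(new_vertex)
    return adj


adj = grow_random_walk_network(n_final=1000, m=2, walk_length=1, seed=42)
degrees = [len(neighbors) for neighbors in adj]
# 3 links in the seed triangle plus 2 links for each of the 997 added vertices
print(len(adj), sum(degrees) // 2)  # prints: 1000 1997
```

Note that the walker never reads a vertex degree: it only needs the neighbor list of its current position, which is the local information the mechanism relies on.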

In this algorithm, preferential attachment follows from the fact that the
probability of the random walker ending up at a highly connected vertex is higher
than that of it ending up at a vertex with fewer connections. In the context
of e.g. the WWW or social networks, the idea is intuitively appealing: we tend
to get to know new people through those people we already know, which quickly
leads us to "popular" persons without explicitly looking for them.



In the following, we show that choosing vertices by the random walk method
is equivalent to the BA preferential attachment rule [11], stating that the
probability $P$ of node $i$ being chosen depends on its degree $k_i$:

$$P(i) = \frac{k_i}{\sum_j k_j}. \tag{1}$$

Here $\sum_j k_j$ denotes the sum over the degrees of all nodes, that is, twice the
total number of links within the network. Let us define $P(A)$ as the probability
that vertex $A$ is chosen as the initial vertex, $P(A) = 1/N$, and $P(B)$ as
the probability that we arrive at its neighbor vertex $B$ by following one of the
$k_A$ links attached to $A$. We can derive an expression for $P(B)$ by utilizing
Bayes' rule:

$$P(A|B) = \frac{P(B|A)\,P(A)}{P(B)}, \tag{2}$$

where the conditional probability $P(B|A)$ denotes the probability of arriving
at $B$ if $A$ is chosen as the starting vertex. This probability equals the
probability that, of the $k_A$ links out of node $A$, the correct link is followed:

$$P(B|A) = \frac{1}{k_A}. \tag{3}$$

Likewise, the conditional probability $P(A|B)$, i.e. the probability that if the
walk arrived at $B$ it originated at $A$, can be written as

$$P(A|B) = \frac{1}{k_B}. \tag{4}$$

Combining all the above we arrive at

$$P(B) = \frac{P(B|A)\,P(A)}{P(A|B)} = \frac{k_B}{N k_A}. \tag{5}$$

If we now continue the walk and utilize the derived result for $P(B)$ in
calculating the probability of the walker ending up at node $C$ within one single
step, we get

$$P(C) = \frac{P(C|B)\,P(B)}{P(B|C)} = \frac{k_C}{N k_A}, \tag{6}$$

as $k_B$ cancels out. Thus, the random walk length $l$ does not influence the
probabilities of nodes to be chosen at all, although it plays a role in determining
the clustering properties of the network, as we shall see below. Finally, as node
$A$ was chosen randomly, its degree $k_A$ equals, on the average, the average
degree $\langle k \rangle$ of the network, and since $N k_A = N \langle k \rangle = \sum_j k_j$ we arrive at
the BA preferential attachment rule (1), with the well-known consequence
for the degree distribution $p(k) \propto k^{-3}$, i.e. the degree exponent $\gamma = 3$. It
should be noted that degree-degree correlations could in principle influence
this outcome; however, no significant correlations were found in simulated
networks generated by our random walk method.
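
The averaged argument above can be complemented by a small exact computation. The sketch below propagates the probability distribution of the walker's position step by step on a hand-made four-vertex graph (a hypothetical example, not taken from the simulations). It relies on a standard property of random walks on connected undirected graphs that is not spelled out above: the degree-proportional distribution of rule (1) is left unchanged by a walk step, and on a graph containing a triangle a uniform starting distribution converges to it as the walk gets longer. The function names are illustrative.

```python
def step_distribution(adj, dist):
    """One random-walk step: every vertex passes its probability
    in equal shares to its neighbors."""
    new_dist = [0.0] * len(adj)
    for v, neighbors in enumerate(adj):
        share = dist[v] / len(neighbors)
        for u in neighbors:
            new_dist[u] += share
    return new_dist


def degree_proportional(adj):
    """Rule (1): P(i) = k_i / sum_j k_j."""
    total_degree = sum(len(neighbors) for neighbors in adj)
    return [len(neighbors) / total_degree for neighbors in adj]


# Hypothetical example graph: triangle 0-1-2 plus a pendant vertex 3 on vertex 0
example = [[1, 2, 3], [0, 2], [0, 1], [0]]
target = degree_proportional(example)        # [3/8, 2/8, 2/8, 1/8]

# The degree-proportional distribution is invariant under one step
after_one_step = step_distribution(example, target)

# A uniform start (P(A) = 1/N) approaches the same distribution for long walks
dist = [1.0 / len(example)] * len(example)
for _ in range(100):
    dist = step_distribution(example, dist)
```

On such a small graph a single step from the uniform start is not exactly degree-proportional; the equality in Eqs. (5) and (6) holds in the averaged sense used in the text, where $k_A$ is replaced by $\langle k \rangle$.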

Figure 1 displays the degree distribution $p(k)$ versus $k$ on log-log scale,
obtained by simulations for $m = 2, 4, 8, 16$, with $N = 10^6$, and averaged over 100
runs each. The random walk length was chosen as $l = 1$. In this figure, the solid
lines indicate corresponding BA degree distributions of the form [11, 15]

$$p(k) = \frac{2m(m+1)}{k(k+1)(k+2)}, \tag{7}$$

which in the limit $m \gg 1$, $k \gg 1$ can be written in the common form
$p(k) = 2m^2/k^3$. The simulated degree distributions match the theoretical ones
very well. We have also repeated the same runs for larger values of $l$ and
obtained very similar results, supporting the conclusion that the random walk
length does not influence the degree distribution.
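
As a quick check of Eq. (7), the following sketch evaluates the distribution and its partial sums. It assumes that the distribution is supported on $k \ge m$, in which case the sum telescopes and the partial sum up to $K$ equals $1 - m(m+1)/((K+1)(K+2))$, so the distribution is normalized. The function names are illustrative.

```python
def ba_degree_distribution(k, m):
    """Eq. (7): p(k) = 2m(m+1) / (k(k+1)(k+2)), taken to be zero for k < m."""
    if k < m:
        return 0.0
    return 2.0 * m * (m + 1) / (k * (k + 1) * (k + 2))


def partial_sum(m, k_max):
    """Sum of p(k) over k = m, ..., k_max."""
    return sum(ba_degree_distribution(k, m) for k in range(m, k_max + 1))


def ratio_to_power_law(k, m):
    """p(k) divided by the common form 2 m^2 / k^3."""
    return ba_degree_distribution(k, m) / (2.0 * m * m / k ** 3)


m = 4
total = partial_sum(m, 10 ** 5)          # close to 1
ratio = ratio_to_power_law(10 ** 4, m)   # close to (m + 1) / m = 1.25
```

The ratio shows that at moderate $m$ the common form $2m^2/k^3$ underestimates Eq. (7) by a factor of about $(m+1)/m$ in the tail; it only becomes accurate when $m \gg 1$ as well as $k \gg 1$.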

However, as mentioned above, the random walk length $l$ is not irrelevant for
the network properties. When $m > 1$, one can expect that short random walk
lengths give rise to highly clustered networks. The degree of clustering should
be especially high when $l = 1$, as every growth step then results in the
formation of at least $m - 1$ triads of connected vertices. Clustering is measured
in terms of the clustering coefficient, which denotes to what degree the
neighbors of a vertex are also neighbors of each other, and it has maximum value
$C_{\max} = 1$ for a fully connected network. The average clustering coefficient $C$
of the whole network […] is defined as the ratio of the number of existing links $E_i$
between neighbors of vertex $i$ to the possible number of such connections, averaged
over the whole network:

$$C = \frac{1}{N} \sum_i \frac{2E_i}{k_i(k_i - 1)}. \tag{8}$$
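
The definition of the average clustering coefficient translates directly into code. The sketch below assumes an adjacency-list representation (`adj[v]` is the collection of neighbors of vertex `v`) and adopts the convention, not stated in the text, that vertices with fewer than two neighbors contribute zero. The function name is illustrative.

```python
from itertools import combinations


def average_clustering(adj):
    """Average over all vertices of 2 E_i / (k_i (k_i - 1)), where E_i is the
    number of links between the neighbors of vertex i."""
    neighbor_sets = [set(neighbors) for neighbors in adj]
    total = 0.0
    for neighbors in neighbor_sets:
        k = len(neighbors)
        if k < 2:
            continue  # assumed convention: such vertices contribute 0
        links = sum(1 for u, w in combinations(neighbors, 2)
                    if w in neighbor_sets[u])
        total += 2.0 * links / (k * (k - 1))
    return total / len(neighbor_sets)


triangle = [[1, 2], [0, 2], [0, 1]]
path = [[1], [0, 2], [1]]
print(average_clustering(triangle), average_clustering(path))  # prints: 1.0 0.0
```

The pair loop costs on the order of $k_i^2$ operations per vertex, which is acceptable for a sketch but becomes noticeable for the hubs of large scale-free networks.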



The case $m = 2$ is especially interesting: one can easily see that with $l = 1$,
the random walk model is identical to the Holme-Kim model […] with
triad formation probability $P_t = 1$. Furthermore, the clustering properties
are similar to the Dorogovtsev-Mendes-Samukhin model as well as to
the highly clustered Klemm-Eguíluz model. In this case, a new triad is
formed every time a new vertex is added to the network. We can calculate the
clustering coefficient using a slight variation of the rate equation approach
for each vertex, $dE(k)/dk = 1$ with initial condition $E(2) = 1$, resulting in
$E(k) = k - 1$. Thus, the clustering coefficient of a vertex of degree $k$ equals
$C(k) = 2/k$. Now the average clustering coefficient is obtained by using the
degree distribution of Eq. (7):

$$C = \sum_{k=2}^{\infty} \frac{2}{k} \, \frac{2m(m+1)}{k(k+1)(k+2)}, \tag{9}$$



yielding a numerical value of $C_{m=2,\,l=1} \approx 0.74$. For $l = 3$, one can easily
see that new triads are most likely formed by the walker returning to the
previous vertex at steps 2 or 3. The probabilities of both the above cases can
be approximated to be $1/\langle k \rangle$; thus, on the average, $dE/dk \approx 2/\langle k \rangle = 1/2$.
Setting $E(2) = 1/2$ we get $C(k) \approx 1/k$, yielding $C_{m=2,\,l=3} \approx 0.37$. For larger
odd values of $l$, the probability of the walker returning to one of the neighbors
of the first marked node decreases, causing the clustering coefficient to decrease
as well. It is interesting to note that a $C(k) \sim k^{-1}$ dependence similar to the
$l = 1$ and $l = 3$ cases has been observed in several real-world networks [18].
The clustering coefficient distribution $C(k) = 2/k$ of the $l = 1$ case is also
similar to that of the deterministic scale-free graph of Ref. [15].
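
The two numerical values quoted above can be reproduced by summing $C(k)\,p(k)$ directly, with $p(k)$ taken from Eq. (7) and $m = 2$. The sketch below truncates the infinite sum at a cutoff `k_max`, which is an assumption of the example; the neglected tail decays as $1/k_{\max}^3$. For $C(k) = 2/k$ a partial-fraction evaluation of the sum gives the closed form $2\pi^2 - 19 \approx 0.7392$, which serves as an independent check.

```python
import math


def average_clustering_from_ck(ck, m, k_max=100_000):
    """Sum of C(k) p(k) over k = m, ..., k_max, with p(k) from Eq. (7)."""
    return sum(ck(k) * 2.0 * m * (m + 1) / (k * (k + 1) * (k + 2))
               for k in range(m, k_max + 1))


c_l1 = average_clustering_from_ck(lambda k: 2.0 / k, m=2)  # l = 1: C(k) = 2/k
c_l3 = average_clustering_from_ck(lambda k: 1.0 / k, m=2)  # l = 3: C(k) ~ 1/k

print(round(c_l1, 2), round(c_l3, 2))         # prints: 0.74 0.37
print(abs(c_l1 - (2 * math.pi ** 2 - 19)) < 1e-9)  # prints: True
```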

Estimating the clustering coefficient values for even random walk lengths turns
out to be more difficult. In the $l = 2$ case, $dE/dk \sim E/(k \langle k \rangle)$, i.e. a triad
is formed only if the walker at step 2 follows one of the already existing $E$
neighbor-neighbor links out of $\sim k \langle k \rangle$ possible links. This equation suggests
that for (low) even values of $l$, the triad formation probability is significantly
lower than for odd values of $l$. It is also influenced by the initial conditions
(for $l = 2$, if there are no clusters in the initial configuration, no triads can
be formed at all), and the configuration of links added at the initial stages of
growth. Unlike for odd $l$-values, one can expect an increase in clustering with
increasing even $l$, as the probability of the walker returning to the vicinity of
its starting point increases.

Figure 2 illustrates the dependence of $C$ on $l$ for $m = 2$ and 4, calculated by
averaging over 100 runs of networks with size $N = 25{,}000$. The $l = 1$ and
$l = 3$ values for the $m = 2$ case match the above predictions, as does the
overall shape of the curves. At low enough values of $l$, the clustering coefficient
depends heavily on the random walk length. At odd values of $l$ the generated
networks are highly clustered, but there is a large difference between even and
odd values of $l$. This difference disappears for large $l$, while at the same time
the clustering coefficient converges to a constant value. The probable reason
for this is that $l$ becomes large enough for the $m$ nodes to be uncorrelated,
and thus increasing it further has no further effect. Furthermore, investigation
of the clustering coefficients of networks generated in individual runs reveals
that when $l$ is small, there is only little scatter for odd values of $l$, whereas
the coefficients can differ greatly for even values of $l$.

Finally, we have investigated the dependency of $C$ on the network size. Figure
3 illustrates $C$ as a function of $N$ for $m = 4$ and 8, and for several values of $l$.
When the random walks are short, $C$ appears quite independent of $N$. As
discussed above, increasing (odd) $l$ results in a decrease in clustering. However, as
$l$ becomes large enough, we obtain clustering similar to that of BA networks,
where $C(N) \propto (\ln N)^2/N$ for $m \gg 1$ […]. The solid lines indicate clustering
coefficients $C(N)$ calculated by using the asymptotically similar formula
(17) of Ref. […], which is more accurate at low $m$. The theoretical curves show
good agreement with our data. Thus, the BA preferential attachment model
could be considered as the limiting case of our model for $l \gg 1$.

In conclusion, we have presented a random-walk-based undirected network
growth mechanism which results in scale-free networks with degree exponent
$\gamma = 3$ without utilizing any knowledge of node degrees. Furthermore, we have
analytically shown that this mechanism leads to the Barabási-Albert preferential
attachment rule for any random walk length, and thus, somewhat surprisingly,
even one-step random walks produce scale-free degree distributions.
This clearly illustrates that growing networks can self-organize into scale-free
structures even when based on simple, local growth rules. We have also shown
that the degree of clustering of the network depends on the random walk length
in addition to the network size. It becomes similar to that of BA
networks when the random walks are long enough.

Fig. 1. Simulated degree distributions $p(k)$ as function of $k$, calculated
by averaging over 100 runs with networks grown to size $N = 10^6$, for $m = 2$ (○),
$m = 4$ (□), $m = 8$ (▽) and $m = 16$ (◇). The networks were generated using random
walks of length $l = 1$. The solid lines indicate the respective Barabási-Albert degree
distributions.

Fig. 3. Clustering coefficient $C$ as function of network size $N$ for a) $m = 4$, b) $m = 8$,
with random walk lengths $l = 1$ (□), $l = 3$ (*), $l = 5$ (▽), $l = 15$ (+), and $l = 25$
(○), calculated as averages over 100 simulated runs. The dashed lines serve as guides
to the eye and the solid line displays the theoretical prediction […] for clustering
coefficients of Barabási-Albert networks.